feat: add native Google/Gemini Embedding 2 support with Parts API#718

Open
ZaynJarvis wants to merge 17 commits into volcengine:main from ZaynJarvis:feat/google-embedding-native-api

Conversation


@ZaynJarvis ZaynJarvis commented Mar 17, 2026

Overview

This PR adds native Google Gemini Embedding 2 support using Google's official embedding API instead of the OpenAI-compatible format.

Status: 🔍 Code reviewed. Pending real-world testing.

Key Changes

  • Native API Integration: Uses Google's native embedding API endpoint (/v1beta/models/gemini-embedding-2-preview:embedContent) with Parts format
  • Gemini Embedding 2 Only: Focused implementation supporting only gemini-embedding-2-preview (3072 dimensions with MRL support)
  • Task-Specific Embeddings: Supports RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY, CLASSIFICATION, and CLUSTERING task types
  • Flexible Parameter Format: Supports both simple format (e.g., 'RETRIEVAL_QUERY') and key=value format (e.g., 'task_type=RETRIEVAL_QUERY,output_dimensionality=1024')
  • Matryoshka Reduction: Built-in support for dimension reduction using output_dimensionality parameter
  • Future Multimodal Ready: Uses Parts API structure that can be extended for multimodal content
  • Chunking Support: Automatic text chunking and averaging for oversized inputs
  • Updated Documentation: Added configuration examples and provider documentation
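
The flexible parameter format described above could be parsed along these lines. This is an illustrative sketch only; `parse_embed_params` and its exact behavior are assumptions, not the PR's actual helper:

```python
def parse_embed_params(spec: str) -> dict:
    """Parse either a bare task type ('RETRIEVAL_QUERY') or a key=value
    list ('task_type=RETRIEVAL_QUERY,output_dimensionality=1024')."""
    params = {}
    if "=" not in spec:
        # Simple format: the whole string is the task type.
        params["task_type"] = spec
    else:
        for pair in spec.split(","):
            key, _, value = pair.partition("=")
            params[key.strip()] = value.strip()
    # output_dimensionality, if present, is an integer (Matryoshka reduction).
    if "output_dimensionality" in params:
        params["output_dimensionality"] = int(params["output_dimensionality"])
    return params
```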

Testing Needed

  • Real-world testing with a Google API key
  • Verify task-specific embeddings work correctly
  • Test Matryoshka dimension reduction
  • Validate chunking for oversized inputs

- Replace OpenAI-compatible implementation with native Gemini API
- Support task-specific embeddings (RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, etc.)
- Add Matryoshka dimension reduction support
- Include chunking for oversized texts
- Add configuration examples and documentation
- Support both simple and key=value parameter formats
- Use Parts API for future multimodal capability
- Remove stray server.pid file
- Fix base URL to https://generativelanguage.googleapis.com/v1beta
- Use x-goog-api-key header instead of URL parameter
- Remove model field from request body (already in URL)
- Follow official Google API format exactly
- Remove support for text-embedding-004 and text-embedding-005
- Focus implementation on gemini-embedding-2-preview only
- Add model validation to ensure only supported model is used
- Update documentation to reflect single model support
- Clarify that this is specifically for Gemini Embedding 2
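
Putting the commit fixes above together, the native request shape can be sketched as follows. This is illustrative, not the PR's actual embedder code; `embed_text` is a hypothetical helper, and the `embedding.values` response access assumes the documented Gemini response shape:

```python
import requests

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-embedding-2-preview"

def embed_text(text: str, api_key: str) -> list[float]:
    # The model is part of the URL; it is NOT repeated in the request body.
    url = f"{BASE_URL}/models/{MODEL}:embedContent"
    body = {"content": {"parts": [{"text": text}]}}  # Parts API structure
    resp = requests.post(
        url,
        json=body,
        headers={"x-goog-api-key": api_key},  # header, not a URL parameter
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]["values"]
```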
ZaynJarvis and others added 2 commits March 18, 2026 09:32
- Covers basic functionality, advanced features, error handling
- Includes 11 test scenarios with expected outcomes
- Provides configuration examples and debug commands
- Ready for real-world testing with provided API key
@ZaynJarvis ZaynJarvis marked this pull request as ready for review March 18, 2026 03:46
ZaynJarvis (author):

@qin-ctx /review


@qin-ctx qin-ctx left a comment


Review Summary

Found 4 blocking bugs that prevent embed() from functioning at runtime, plus 4 non-blocking suggestions.

Blocking Issues

  1. _estimate_tokens() method is not defined anywhere in the class hierarchy — every embed() call will crash with AttributeError
  2. _chunk_text() method is not defined in the embedder class hierarchy — chunking logic will crash
  3. self.max_tokens vs self._max_tokens — attribute name mismatch causes AttributeError
  4. cfg.max_tokens — field does not exist on EmbeddingModelConfig, factory lambda will crash

Non-blocking

  1. Inconsistent camelCase/snake_case in API request body (taskType vs output_dimensionality)
  2. No retry mechanism for HTTP requests
  3. No automated unit tests
  4. Gemini model row added to Volcengine model table in docs

def _chunk_and_embed(self, text: str, is_query: bool = False) -> EmbedResult:
    """Chunk oversized text and average the embeddings.

    Args:
qin-ctx (reviewer):

[Bug] (blocking) _chunk_text() is not defined in the embedder class hierarchy.

_chunk_and_embed() calls self._chunk_text(text, self.max_tokens), but this method does not exist in GoogleDenseEmbedder or any of its parent classes. The only _chunk_text in the codebase is a @staticmethod on SessionCompressor (in openviking/session/compressor.py), which is unrelated to embedders.

Also, self.max_tokens should be self._max_tokens (same attribute name mismatch as above).

ZaynJarvis (author):

Checked: _chunk_text() is defined in EmbedderBase (base.py:121) and inherited. Real bug found and fixed: the call was passing two args (self._chunk_text(text, self.max_tokens)) to a method that only accepts one. Removed the extra argument.
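
The chunk-and-average strategy under discussion can be sketched like this. These are illustrative stand-ins, not the actual `EmbedderBase` helpers (whose signatures differ, as this thread shows):

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return max(1, len(text) // 4)

def chunk_text(text: str, max_tokens: int = 8192) -> list[str]:
    # Split into character windows sized from the token budget.
    max_chars = max_tokens * 4
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def chunk_and_embed(text: str, embed_fn, max_tokens: int = 8192) -> list[float]:
    """Embed oversized text by chunking it and averaging the chunk vectors."""
    chunks = chunk_text(text, max_tokens)
    vectors = [embed_fn(c) for c in chunks]
    dim = len(vectors[0])
    # Element-wise mean across chunk embeddings.
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```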

"api_key": cfg.api_key,
"api_base": cfg.api_base,
"dimension": cfg.dimension,
**({"query_param": cfg.query_param} if cfg.query_param else {}),
qin-ctx (reviewer):

[Bug] (blocking) cfg.max_tokens does not exist on EmbeddingModelConfig.

EmbeddingModelConfig (pydantic model with extra="forbid") has no max_tokens field. Accessing cfg.max_tokens will raise AttributeError. The max_tokens field exists on VLMConfig but not on EmbeddingModelConfig.

Suggested fix: Either add a max_tokens field to EmbeddingModelConfig, or remove this line and handle the default inside GoogleDenseEmbedder.__init__ (which already defaults to 8192).

ZaynJarvis (author):

Checked: max_tokens field IS defined on EmbeddingModelConfig (embedding_config.py:54-57). No fix needed.

# Build request body using Parts API
request_body = {"content": {"parts": [{"text": text}]}}

# Add task-specific parameters
qin-ctx (reviewer):

[Design] (non-blocking) No retry mechanism for API requests.

This uses raw requests.post() without any retry logic. Other providers (Jina, Voyage, OpenAI) benefit from the OpenAI client's built-in retry mechanism. The base module already provides exponential_backoff_retry in openviking/models/embedder/base.py which could be used here to handle transient network failures.

ZaynJarvis (author):

Fixed: wrapped requests.post with exponential_backoff_retry from base module, retrying on ConnectionError and Timeout.
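
A generic wrapper in the spirit of that fix might look like the following. It is a sketch only; the codebase's `exponential_backoff_retry` helper may have a different name and signature:

```python
import time

def retry_with_backoff(fn, retryable=(ConnectionError, TimeoutError),
                       max_attempts=4, base_delay=0.5):
    """Call fn(), retrying on transient errors with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

In the embedder this would wrap the `requests.post` call, catching only transient network errors so that API-level failures (bad key, invalid request) still fail fast.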

ZaynJarvis (author):

will fix.

@@ -128,6 +128,7 @@ Embedding model configuration for vector search, supporting dense, sparse, and h
|-------|-----------|------------|-------|
| `doubao-embedding-vision-250615` | 1024 | multimodal | Recommended |
| `doubao-embedding-250615` | 1024 | text | Text only |
qin-ctx (reviewer):

[Suggestion] (non-blocking) gemini-embedding-2-preview is added to the "Available Models" table which currently only contains Volcengine doubao-* models. This could be confusing since they are from different providers. Consider either adding a "Provider" column to the table, or listing Google models in a separate table under the Google provider section below.

ZaynJarvis (author):

Fixed: added a Provider column to the table so it is clear which provider each model belongs to.

- Fix _chunk_text called with extra arg (real bug: base method only accepts text)
- Fix inconsistent API key naming: output_dimensionality -> outputDimensionality
- Add exponential_backoff_retry for transient network failures
- Add Provider column to docs model table for clarity
ZaynJarvis (author):

All four bugs are possibly caused by #741; will fix now.

- Add max_tokens property, _estimate_tokens, _chunk_text (+ helpers) to
  GoogleDenseEmbedder — these were removed from base class in main
- Restore max_tokens field on EmbeddingModelConfig for google factory
- Both snake_case (task_type) and camelCase (taskType) are accepted by the API
- All task type values produce identical embeddings in this model version
- Parameter is forwarded for forward compatibility with future model versions
gemini-embedding-2-preview silently ignores taskType — verified 2026-03-19
at full 3072 dims, all task types return bit-for-bit identical vectors.
Remove query_param, document_param, _parse_param_string, _build_request_params.
Add note in docstring. Update factory and tests accordingly.
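
The "bit-for-bit identical" claim above amounts to a check like the following (a sketch; `vectors_identical` is a hypothetical helper, and exact float equality is intentional here, not a tolerance comparison):

```python
def vectors_identical(embeddings: dict[str, list[float]]) -> bool:
    """Return True if the embeddings returned for every task type are
    bit-for-bit identical (exact equality of every float)."""
    reference = next(iter(embeddings.values()))
    return all(vec == reference for vec in embeddings.values())
```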
…mbedder

- Rename embedding provider name from "google" to "gemini" throughout
  config, validation, factory registry, and docs
- Add max_tokens param/property to OpenAIDenseEmbedder (default 8000)
- Forward max_tokens from config to openai and ollama factory lambdas
  so user-configured chunking thresholds are not silently ignored

Labels

None yet

Projects

Status: Backlog

2 participants